Task 11

Data description

We will work with the dataset geopol from the previous tasks. Let us group the countries according to whether they are African or not. From the previous tasks, we already know that the covariates may be split into several groups without losing too much information. Therefore, let us consider only a subset of the covariates, namely popu, giph, ripo, and rupo.

LDA

Let us now perform the LDA classification.
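The fitting step itself is not shown in this report. A minimal sketch of how it might look with lda() from the MASS package is given here, assuming that dataA holds the factor africa together with the four covariates. The geopol data are not reproduced, so the block uses simulated stand-in values under the same column names; the numbers it prints say nothing about the real countries.

```r
library(MASS)  # lda() and its predict() method

# Simulated stand-in for the geopol subset (hypothetical values, not the real data).
set.seed(1)
n <- 40
africa <- factor(rep(c("no", "yes"), each = n / 2))
shift <- ifelse(africa == "yes", 1.5, 0)
dataA <- data.frame(
  africa = africa,
  popu = rnorm(n),
  giph = rnorm(n) - shift,
  ripo = rnorm(n) + shift,
  rupo = rnorm(n) + shift
)

# Fit LDA on all four covariates and project the observations onto LD1.
fit <- lda(africa ~ ., data = dataA)
pred <- predict(fit)

# Confusion table of true versus predicted group (in-sample).
conf <- table(truth = dataA$africa, predicted = pred$class)
print(conf)
```

With two groups there is a single discriminant direction, so pred$x has one column, which is what the histogram call uses.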

The histograms below show that the two groups overlap only slightly in the LDA projection. We can also see that the cut-off value of the projection used for splitting lies somewhere around 1.

library(MASS)  # provides lda() and ldahist()
ldahist(pred$x[, 1], g = dataA$africa)
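The cut-off read off the histogram can also be computed. The sketch below assumes two groups and the MASS convention that the LD1 scores have unit within-group variance, in which case both posterior probabilities equal 0.5 at c = (m1 + m2)/2 + log(p1/p2)/(m2 - m1), where m1, m2 are the group means of LD1 and p1, p2 the priors. The data are again simulated stand-ins, so the resulting value is not the one seen for geopol.

```r
library(MASS)

# Simulated stand-in data (hypothetical values, not the real geopol data).
set.seed(1)
n <- 40
africa <- factor(rep(c("no", "yes"), each = n / 2))
shift <- ifelse(africa == "yes", 1.5, 0)
dataA <- data.frame(africa = africa, popu = rnorm(n), giph = rnorm(n) - shift,
                    ripo = rnorm(n) + shift, rupo = rnorm(n) + shift)

fit <- lda(africa ~ ., data = dataA)
pred <- predict(fit)
ld1 <- pred$x[, 1]

# Group means of the LD1 scores and the priors used by lda(), in factor-level order.
m <- unname(sapply(split(ld1, dataA$africa), mean))
p <- unname(fit$prior)

# Point on LD1 where both posterior probabilities equal 0.5.
cutoff <- (m[1] + m[2]) / 2 + log(p[1] / p[2]) / (m[2] - m[1])

# Classify by thresholding LD1; the sign of m[2] - m[1] gives the direction.
lev <- levels(dataA$africa)
by_cut <- ifelse((ld1 - cutoff) * (m[2] - m[1]) > 0, lev[2], lev[1])
print(cutoff)
```

Thresholding LD1 at this value reproduces the classes returned by predict(), so the histogram and the classifier describe the same split.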

For better insight, we can also look at the partition plots below. We see that the LDA method separates the countries very well for some pairs of covariates, especially for the pair rupo, ripo.

library(klaR)  # provides partimat()
partimat(africa ~ ., data = dataA, method = 'lda')

Comparison with other methods

Finally, we will compare LDA with some other methods that can be used, namely GLM, LM, and quadratic discriminant analysis (QDA). For easier visualisation, we will consider only two covariates, namely ripo and giph, which appeared to be the most important in the PCA used before.

We see that GLM and QDA attain better results and use similar splitting rules. LDA and LM, on the other hand, follow very different splitting rules. Nevertheless, LDA gives almost as good results as GLM and QDA in terms of correctly classified countries, whereas LM is clearly the worst in this respect.
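One way such a comparison might be set up is sketched below. It assumes that GLM means logistic regression and that LM means a linear model on a 0/1 response cut at 0.5; the report does not state either choice explicitly. The data frame dataB and all values in it are simulated stand-ins for the two covariates, so the accuracies printed here do not reproduce the results described above.

```r
library(MASS)  # lda() and qda()

# Simulated stand-in data (hypothetical values, not the real geopol data).
set.seed(1)
n <- 40
africa <- factor(rep(c("no", "yes"), each = n / 2))
shift <- ifelse(africa == "yes", 1.5, 0)
dataB <- data.frame(africa = africa, ripo = rnorm(n) + shift, giph = rnorm(n) - shift)
y <- as.numeric(dataB$africa == "yes")  # 0/1 response for lm()

fit_lda <- lda(africa ~ ripo + giph, data = dataB)
fit_qda <- qda(africa ~ ripo + giph, data = dataB)
fit_glm <- glm(africa ~ ripo + giph, data = dataB, family = binomial)
fit_lm  <- lm(y ~ ripo + giph, data = dataB)

# Predicted classes as logical "is African"; glm and lm are thresholded at 0.5.
is_yes <- list(
  LDA = predict(fit_lda)$class == "yes",
  QDA = predict(fit_qda)$class == "yes",
  GLM = fitted(fit_glm) > 0.5,
  LM  = fitted(fit_lm) > 0.5
)

# Share of correctly classified observations (in-sample).
accuracy <- sapply(is_yes, function(z) mean(z == (y == 1)))
print(round(accuracy, 3))
```

These are in-sample accuracies; a fairer ranking of the four methods would use cross-validation or a held-out set.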

Jiri Havranek

2022-05-09